

Attorney's Docket No.: 042390.P7512

**PATENT** 

### IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

| In Re Patent Application of:                                                                                | )           |           |                   |
|-------------------------------------------------------------------------------------------------------------|-------------|-----------|-------------------|
| Orna Etzion                                                                                                 | )           | Examiner: | Meonske, Tonia L. |
| Application No.: 09/676,175                                                                                 | )           | Art Unit: | 2183              |
| Filed: September 29, 2000                                                                                   | )           |           |                   |
| For: A Method and Apparatus for<br>Generating an Expected Top of<br>Stack During Instruction<br>Translation | )<br>)<br>) |           |                   |

Commissioner for Patents P.O. Box 1450 Alexandria, VA 22313-1450

### **DECLARATION UNDER 37 C.F.R. §1.131**

Sir:

### I, Orna Etzion, declare that:

- 1. I am the inventor of claims 1, 3-6, 8-11 and 13-15 of the above identified patent application.
- 2. Prior to June 16, 2000, I conceived the idea of a method and apparatus for generating an expected top of stack during instruction translation, as described and claimed in my application.
- 3. An Intel Invention disclosure, dated July 7, 1999 (copy attached hereto as Exhibit A), which describes an embodiment of the invention, was prepared by myself as a submission to our legal team for consideration for filing a patent application. The invention disclosure describes the operation of generating an expected top of stack during instruction translation, as is described and claimed in our application.

- 4. Sometime thereafter, the Intel patent legal team considered the invention disclosure and approved the invention disclosure for filing as an application in the United States.
- 5. Sometime thereafter, I traveled from Israel to the United States in the spring of 2000 to meet with our patent attorney to discuss the invention of the above identified patent application, as part of our continuous effort in preparing a draft of the above identified patent application.
- 6. A draft of the above-identified patent application was forwarded to myself, via the email from John Ward on June 24, 2000. I received and reviewed the draft of the patent application, and provided my feedback on the draft on July 3, 2000. (Copies of the emails are attached hereto as Exhibit B)
- 7. Following subsequent back and forth communications between myself, located in Israel, and the attorney, located in California, I believe the above-identified patent application was filed thereafter with the PTO on September 29, 2000.
- 8. We declare, to the best of our knowledge, all statements made in this document are true, and that all statements made on information are believed to be true; and further, that these statements were made with the knowledge that willful false statements are punishable by fine or imprisonment, or both, under § 1001 of Title 18 of the United States Code and that such willful false statements may jeopardize the validity of the above-identified patent application or any patent issued thereon.

| Date: | February 1, 2005 |  |
|-------|------------------|--|
|       |                  |  |

# INTEL INVENTION DISCLOSURE

MPG / MPL FILE

DATE: July 7, 1999

LEGAL ID#: (illegible)

**Inventor**

- Name: Orna Etzion
- SS#:
- Empl. No.: 10122359
- Dept. #: 6985
- Phone: 4865-5720
- M/S: IDC-1 D
- Home Address: 5 Kariv st.
- Citizenship: Israel
- Supervisor: Yaron Sheffer (M/S: IDC-1 D)
- Group Name: MPL
- Division Name: MPG

Stamp: RECEIVED, JUL 08 1[…], PATENT DATABASE, INTEL LEGAL

**Second inventor entry**

- Name: ?
- SS#: N/A
- Phone: ?
- M/S: IDC-1 D
- Home Address:
- Group Name: ML
- Division Name: MPG

**Stage of development**

The technique has been implemented in a dynamic IA32→IA64 binary translator […] currently a research project, for floating-point stack simulation.

**Publication and use**

- a) Has a description of your invention been, or will it shortly be, published outside Intel? NO: YES: X. DATE WAS OR WILL BE PUBLISHED: 10/99. If YES, was the manuscript submitted for pre-publication approval? YES: X NO:
- b) Has your invention been used/sold or planned to be used/sold by Intel or others? NO: YES: X. DATE WAS OR WILL BE SOLD: may be used in future implementations of […] yet on plan of record.
- […] invention conceived, or constructed during performance of a government or third party contract, please check here […] and number […]

[…] form, DATED AND S[…]

LE0299A/10-26-93

### General purpose of the invention

The purpose of this invention is to efficiently maintain synchronization of a simulated circular register stack. The invention may be valuable for binary translation from a source computer architecture that contains such a stack to a target architecture that supports a flat register file. The invention may be used in dynamic or static binary translators, as well as in architectural simulators or virtual-machine implementations that use similar code-generation-based techniques. In particular, the invention provides a significant performance advantage when translating Intel Architecture floating-point code to any other architecture.

### Advantages of the invention over what is done now

The invention is significantly faster than any known alternative.

Emulating a stack rotation by multiple move operations requires performing those moves for every stack push or pop. The number of moves required per occurrence is the size of the stack, and the moves contain many internal dependencies. In the proposed invention, the rotation moves are performed only in extremely rare cases.

Emulating a stack in memory suffers from a great load-store overhead, which the proposed invention avoids.

### Essential elements or key to the invention

The following section demonstrates the key elements of the invention using, as an example, an IA32→IA64 binary translator. The relevant aspect is the emulation of IA32 floating-point (FP) register stack, using the flat FP register-file of an IA64 target machine.

References to the eight physical FP-registers of the Intel IA32 architecture are always stack-relative. The mapping between stack-relative references and physical registers changes dynamically. For example, the physical registers corresponding to ST(0) before and after executing an FLD instruction are different, since FLD pushes a value onto the FP-stack.
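The stack-relative addressing described above can be sketched as modular arithmetic. This is a minimal illustration and not part of the disclosure: the function names are hypothetical, and placing the stack on `f20`..`f27` is an assumption taken from the example tables of this disclosure. It also assumes, as on IA32, that a push decrements the top-of-stack index and a pop increments it.

```python
STACK_SIZE = 8     # IA32 exposes eight physical FP registers
BASE_TARGET = 20   # assumption: targets are f20..f27, as in the disclosure's example


def target_register(st_index: int, tos: int) -> str:
    """Name the flat target register that ST(st_index) denotes when the top of stack is `tos`."""
    return f"f{BASE_TARGET + (tos + st_index) % STACK_SIZE}"


def push(tos: int) -> int:
    """A push (e.g. FLD) moves the top of stack down by one slot, wrapping around."""
    return (tos - 1) % STACK_SIZE


def pop(tos: int) -> int:
    """A pop (e.g. FMULP) moves the top of stack up by one slot, wrapping around."""
    return (tos + 1) % STACK_SIZE
```

With a top of stack of 5, `target_register(0, 5)` names `f25`; after one pop the same `ST(0)` reference names `f26`. A translator that knows the top of stack at block entry can therefore resolve every stack-relative reference to a fixed register at translation time.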

However, in the vast majority of practical cases, <u>multiple run-time entries to the same code block repeat the same stack-depth at entrance</u>. Speculating the state at the entry point allows an effective *static* mapping between any IA32 FP register-references in the block and the corresponding IA64 FP-registers. To take advantage of such a speculative approach, the following mechanisms are supported:

- 1. Stack depth speculation: effectively guessing the run-time stack state at all, or almost all, entries to the block. The speculation is done prior to the block translation. A dynamic translator uses the state at the first run-time entry (which is already known when the block is reached). A static translator has to perform code analysis and walk-through to predict the entrance state effectively.
- 2. Tracking the speculation realization: keeping the actual run-time stack state and verifying that the speculative assumption (taken at the translation of the block) is indeed true at each run-time entry. The actual stack depth is updated at the end of the block execution, in a single operation that reflects the overall effect of the entire block. If the block is balanced (same number of pushes and pops), this update code is eliminated. At the beginning of each block, checking code is executed that compares the assumed (speculated) stack depth with the actual one.
- 3. Recovery mechanism: ensuring correct operation when the check fails. Recovery is achieved by actual rotation (copying of register values), so that the actual top-of-stack moves to fit the expected one. The block code remains as is. This method of recovery ensures that the penalty does not propagate: when control is transferred to the next block, the correction is already done, and the stack depth expected by the next block matches the actual depth.
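The run-time side of these three mechanisms (entry check, single epilogue update, recovery by rotation) can be sketched as follows. This is an illustrative model under stated assumptions, not the disclosed implementation: `SimulatedStack`, `run_block`, and `fmulp_at_5` are hypothetical names, Python list slots stand in for the flat target registers, and the epilogue is decided at run time here only for brevity, whereas a translator would simply not emit it for a balanced block.

```python
STACK_SIZE = 8


class SimulatedStack:
    """Eight flat register slots plus the actual top-of-stack (TOS) index."""

    def __init__(self, actual_tos: int):
        self.slots = [None] * STACK_SIZE
        self.actual_tos = actual_tos
        self.rotations = 0   # counts how often recovery was needed

    def rotate(self, delta: int) -> None:
        """Recovery: copy every value `delta` slots forward so the top lands on the expected slot."""
        self.slots = [self.slots[(i - delta) % STACK_SIZE] for i in range(STACK_SIZE)]
        self.actual_tos = (self.actual_tos + delta) % STACK_SIZE
        self.rotations += 1


def run_block(stack, expected_tos, body, net_pops):
    """Execute one translated block.

    expected_tos: the TOS speculated when the block was translated
    body:         the block's code, written against fixed slot numbers
    net_pops:     pops minus pushes over the whole block (0 for a balanced block)
    """
    if stack.actual_tos != expected_tos:                 # prologue check
        stack.rotate((expected_tos - stack.actual_tos) % STACK_SIZE)
    body(stack.slots)                                    # statically mapped code
    if net_pops != 0:                                    # epilogue, absent for balanced blocks
        stack.actual_tos = (stack.actual_tos + net_pops) % STACK_SIZE


def fmulp_at_5(slots):
    """A multiply-and-pop translated under the assumption TOS = 5: ST(0) is slot 5, ST(1) is slot 6."""
    slots[6] = slots[6] * slots[5]
```

Entering `fmulp_at_5` with an actual TOS of 5 runs the body directly. Entering it with an actual TOS of 3 triggers one rotation by 2 first; in both cases the block leaves the actual TOS at 6, so the next block's check is unaffected by the earlier misprediction.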

<u>Note:</u> This invention disclosure does not describe how stack exception conditions are detected. The solution to that problem is covered by another patent disclosure.

### Example

The example on the following page consists of two very simple floating-point blocks. It shows the behavior of the translation mechanism in the regular case (when the expected top of stack (TOS) equals the actual one) and in the special case (when they differ). Note that block L2 is balanced, hence its epilogue does not update the actual TOS value. Also note that the correction done for L1 (in the special case) does not affect the normal flow at L2. The actual TOS value is best held in a global integer register (but not necessarily).

As already stated, although the example refers to IA32→IA64 translation, the invention principles are applicable to any other case of emulating a rotating stack by a static register file.
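The translation-time side of the example can be sketched in the same spirit. The following hypothetical `translate_block` function takes a block's source instructions and the TOS speculated for its entry, and emits pseudo-code in the style of the example: an entry check, statically mapped register operations, and an `Actual_TOS` update only when the block is unbalanced. The instruction encoding, the three supported mnemonics, and the `f20`..`f27` layout are assumptions made for illustration; operand order within an emitted operation may differ from the example page.

```python
STACK_SIZE = 8
BASE_TARGET = 20   # assumption: the stack lives in f20..f27, as in the example tables


def reg(st_index, tos):
    """Flat register name for ST(st_index) under a given top of stack."""
    return f"f{BASE_TARGET + (tos + st_index) % STACK_SIZE}"


def translate_block(label, instructions, expected_tos, next_label):
    """Emit pseudo-code lines for one block, given the TOS speculated for its entry."""
    tos = expected_tos
    out = [f"{label}: Cmp {expected_tos}, Actual_TOS", "NE ? BR Correct"]
    for name, *operands in instructions:
        if name in ("FMULP", "FADDP"):
            op = "*" if name == "FMULP" else "+"
            st0, st1 = reg(0, tos), reg(1, tos)
            out.append(f"{st1} = {st1} {op} {st0}")   # ST(1) = ST(1) op ST(0)
            tos = (tos + 1) % STACK_SIZE               # then pop
        elif name == "FLD":
            tos = (tos - 1) % STACK_SIZE               # push first
            out.append(f"flde {reg(0, tos)} = {operands[0]}")   # load into the new ST(0)
        else:
            raise ValueError(f"unsupported instruction: {name}")
    if tos != expected_tos:                            # unbalanced block: one update in the epilogue
        out.append(f"Actual_TOS = {tos}")
    out.append(f"BR {next_label}")
    return out


if __name__ == "__main__":
    for line in translate_block("L1", [("FMULP",)], 5, "L2"):
        print(line)
    for line in translate_block("L2", [("FADDP",), ("FLD", "[r20]")], 6, "L3"):
        print(line)
```

For L1 the sketch emits an `Actual_TOS = 6` update because the block pops once; for L2 the pop and the push cancel, so no update is emitted.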

### Value of the invention to Intel: how will it be used?

This invention is valuable to Intel because it can be used to significantly speed up the floating-point performance of IA32→IA64 dynamic binary translation. Such a project currently exists as a research project, but the technology is expected to eventually enter a commercial product of strategic importance to Intel.

LE0299A/10-26-93 REV. 7

### Example





After L1 execution: Expected TOS = 6, Actual TOS = 6

| Source | Value | Target |
|--------|-------|--------|
| ST(1)  | C     | f27    |
| ST(0)  | AB    | f26    |
| ST(7)  | •     | f25    |
| ST(6)  | •     | f24    |
| ST(5)  | •     | f23    |
| ST(4)  | •     | f22    |
| ST(3)  | •     | f21    |
| ST(2)  | •     | f20    |

# Code Block L2

Source:

    L2: FADDP ; //pop
        FLDE (eax) ; //push
        JMP L3

Translated pseudo-code:

    L2: Cmp 6, Actual_TOS
        NE ? BR Correct
        f27 = f26 + f27
        flde f26 = [r20]
        BR L3

After L2 execution: Expected TOS = 6, Actual TOS = 6

| Source | Value | Target |
|--------|-------|--------|
| ST(1)  | AB+C  | f27    |
| ST(0)  | X     | f26    |
| ST(7)  | •     | f25    |
| ST(6)  | •     | f24    |
| ST(5)  | •     | f23    |
| ST(4)  | •     | f22    |
| ST(3)  | •     | f21    |
| ST(2)  | •     | f20    |





Correction pseudo-code:

    Delta = Expected_TOS - Actual_TOS
    Rotate_stack(Delta)
    Return (to L1)

After correction code: Expected TOS = 5, Actual TOS = 5

| Source | Value | Target |
|--------|-------|--------|
| ST(2)  | C     | f27    |
| ST(1)  | B     | f26    |
| ST(0)  | A     | f25    |
| ST(7)  | •     | f24    |
| ST(6)  | •     | f23    |
| ST(5)  | •     | f22    |
| ST(4)  | •     | f21    |
| ST(3)  | D     | f20    |

# Code Block L1

Source:

    L1: FMULP ; //pop
        JMP L2

Translated pseudo-code:

    L1: Cmp 5, Actual_TOS
        NE ? BR Correct
        f26 = f26 * f25
        Actual_TOS = 6
        BR L2

After L1 execution: Expected TOS = 6, Actual TOS = 6

| Source | Value | Target |
|--------|-------|--------|
| ST(1)  | C     | f27    |
| ST(0)  | AB    | f26    |
| ST(7)  | •     | f25    |
| ST(6)  | •     | f24    |
| ST(5)  | •     | f23    |
| ST(4)  | •     | f22    |
| ST(3)  | •     | f21    |
| ST(2)  | D     | f20    |

# Code Block L2

Source:

    L2: FADDP ; //pop
        FLDE (eax) ; //push
        JMP L3

Translated pseudo-code:

    L2: Cmp 6, Actual_TOS
        NE ? BR Correct
        f27 = f26 + f27
        flde f26 = [r20]
        BR L3

After L2 execution: Expected TOS = 6, Actual TOS = 6

| Source | Value | Target |
|--------|-------|--------|
| ST(1)  | AB+C  | f27    |
| ST(0)  | X     | f26    |
| ST(7)  | •     | f25    |
| ST(6)  | •     | f24    |
| ST(5)  | •     | f23    |
| ST(4)  | •     | f22    |
| ST(3)  | •     | f21    |
| ST(2)  | D     | f20    |




### "Etzion, Orna" <orna.etzion@intel.com> on 07/03/2000 03:44:32 AM

To: John Ward/Bstz

cc: "Etzion, Orna" <orna.etzion@intel.com>

Subject: RE: Patent applications

Hi John,

Here are my comments on the draft:

- 1. page 3 line 8: the overhead is reduced (not eliminated), especially for the "on the fly case" where the translation itself is part of the overhead.
- 2. page 4: waiting for the drawings to be faxed.
- 3. page 5 line 19: in "programs" I understand that you mean the programs who are being emulated/translated.
- 4. page 5 line 23: "Stacks may keep ..." is only an example, so maybe it should be mentioned under the "for example" (in line 24).

page 6 line 1: I did not like the "used in this way". I did not understand what you mean by this.

I understood that from page 6 line 6 to page 7 line 6 you describe what it means to use the stack in the original program in an architecture that has a HW built-in stack. The following comments are based on this understanding.

page 6 line 6: What is missing in this paragraph is the explanation that the instructions that refer to the stack refer to operands relative to the TOS. They do not refer to ST0, ST1 etc. but will always refer to TOS, TOS-1 etc. In the example, in terms of the instructions there will be no difference between the 1st instruction, which will push the element into ST0, and the 2nd instruction, which pushes the element into ST1. Both will be pushing into the TOS. It is the HW which maintains the identity of the current TOS (knowing which of the physical entries (ST0-ST4) it currently is).

page 6 line 14: the TOS is not passed from one BB to the next in the original programs. The original program expects the HW to maintain TOS. The important point in the paragraphs that discuss BBs is that it is possible (in the original program) to enter a BB when TOS is a different physical register. The original code will work ok, because the HW will maintain the correct TOS.

page 7 line 5: Again, the original program does not care that the TOS can change from one execution of the BB to the next. The HW will ensure that the instructions will use and set the appropriate physical registers.

I understood that from page 7 line 9 to line 24 you describe the general

Regards,
Orna

-----Original Message-----

From: John Ward [mailto:John\_Ward@bstz.com]

Sent: Saturday, June 24, 2000 3:00 AM

To: orna.etzion@intel.com

Subject: Patent applications

Orna, enclosed is a rough first draft of the patent application originally entitled "maintaining synchronization of a simulated circular-stack of registers during binary translation". Please send me your fax number so that I can fax the figures to you. I need to file the application June 30th. Please let me know when is convenient to discuss your comments/revisions on the draft.

Regards,
-John

(See attached file: P7512 Patent application.ver1.doc)